11 research outputs found

    What is the IQ of your data transformation system?

    Full text link
    Mapping and translating data across different representations is a crucial problem in information systems. Many formalisms and tools are currently used for this purpose, to the point that developers typically face a difficult question: “what is the right tool for my translation task?” In this paper, we introduce several techniques that contribute to answering this question. Among these are a fairly general definition of a data transformation system, a new and very efficient similarity measure to evaluate the outputs produced by such a system, and a metric to estimate user effort. Based on these techniques, we are able to compare a wide range of systems on many translation tasks, to gain interesting insights about their effectiveness and, ultimately, about their “intelligence”.
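    The abstract mentions a similarity measure for comparing the outputs of a transformation system against an expected result. As a minimal illustrative baseline (not the paper's actual measure, which is more sophisticated), one can compare two relational outputs as tuple sets with Jaccard similarity:

```python
def output_similarity(expected, produced):
    """Jaccard similarity over tuple sets: |intersection| / |union|.

    Illustrative baseline only -- a real output-comparison measure
    would also account for labeled nulls and partial tuple matches.
    """
    expected, produced = set(expected), set(produced)
    if not expected and not produced:
        return 1.0  # two empty outputs are identical
    return len(expected & produced) / len(expected | produced)
```

    For example, a produced instance containing one of two expected tuples scores 0.5 under this baseline.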

    Quality of mappings for data exchange applications

    No full text
    Research has investigated mappings among data sources from two perspectives. On one side, there are studies of practical tools for schema mapping generation; these focus on algorithms that generate mappings based on visual specifications provided by users. On the other side, there is theoretical research on data exchange, which studies how to generate a solution -- i.e., a target instance -- given a set of mappings usually specified as tuple generating dependencies. However, these two research lines have progressed in a rather independent way, and we are still far from a complete understanding of the properties that a "good" schema mapping system should have; to give an example, there are many possible solutions for a data exchange problem. In fact, there is no consensus yet on a notion of quality for schema mappings. In this thesis, based on concepts provided by schema mapping and data exchange research, we aim to investigate such a notion. Our goal is to identify a fairly general formal context that incorporates the different mapping-generation systems proposed in the literature, and to develop algorithms, tools and methods for characterizing the quality of mappings generated by those systems.

    ATOM: Automatic Target-driven Ontology Merging

    No full text
    The proliferation of ontologies and taxonomies in many domains increasingly demands the integration of multiple such ontologies to provide a unified view on them. We demonstrate a new automatic approach to merge large taxonomies such as product catalogs or web directories. Our approach is based on an equivalence matching between a source and a target taxonomy to merge them. It is target-driven, i.e., it preserves the structure of the target taxonomy as much as possible. Further, we show how the approach can utilize additional relationships between source and target concepts to semantically improve the merge result.
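    The target-driven idea described above can be sketched as follows. This is a simplified illustration under assumed inputs (taxonomies as parent-to-children dictionaries, an equivalence match from source to target concepts, and matched roots), not ATOM's actual algorithm: matched source concepts are reused from the target, and unmatched ones are attached under the match of their nearest matched ancestor, so the target structure is preserved.

```python
def target_driven_merge(target_edges, source_edges, matches, source_root):
    """Merge a source taxonomy into a target taxonomy.

    target_edges / source_edges: dict mapping parent -> list of children.
    matches: dict mapping a source concept to its equivalent target concept.
    Assumes the source root is matched to a target concept.
    """
    merged = {p: list(cs) for p, cs in target_edges.items()}

    def attach(node, anchor):
        if node in matches:
            # Equivalent concept already exists in the target: reuse it.
            anchor = matches[node]
        else:
            # New concept: hang it under the nearest matched ancestor.
            merged.setdefault(anchor, []).append(node)
            anchor = node
        for child in source_edges.get(node, []):
            attach(child, anchor)

    attach(source_root, None)
    return merged
```

    With matches {'Store': 'Shop', 'Gadgets': 'Electronics'}, an unmatched source concept 'Drones' under 'Gadgets' ends up under 'Electronics' in the merged taxonomy, leaving the target hierarchy intact.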

    Core Schema Mappings

    No full text
    Research has investigated mappings among data sources from two perspectives. On one side, there are studies of practical tools for schema mapping generation; these focus on algorithms that generate mappings based on visual specifications provided by users. On the other side, there is theoretical research on data exchange, which studies how to generate a solution – i.e., a target instance – given a set of mappings usually specified as tuple generating dependencies. However, despite the fact that the notion of the core of a data exchange solution has been formally identified as an optimal solution, there are as yet no mapping systems that support core computations. In this paper we introduce several new algorithms that contribute to bridging the gap between the practice of mapping generation and the theory of data exchange. We show how, given a mapping scenario, it is possible to generate an executable script that computes core solutions for the corresponding data exchange problem. The algorithms have been implemented and tested using common runtime engines to show that they guarantee very good performance, orders of magnitude better than that of known algorithms that compute the core as a post-processing step.
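    To illustrate why the core matters: a data exchange solution may contain redundant tuples whose labeled nulls are "less informative" than other tuples. The sketch below shows a per-tuple simplification of this idea under assumed data structures; the real core is defined via instance-level homomorphisms, which the paper computes by rewriting the mappings rather than by post-processing as done here.

```python
class Null:
    """A labeled null -- a placeholder value in a data exchange solution."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f"N{self.name}"

def subsumed(t, u):
    """True if tuple t maps onto tuple u by consistently replacing
    t's labeled nulls with values. Assumes equal arity."""
    env = {}
    for a, b in zip(t, u):
        if isinstance(a, Null):
            if env.setdefault(a.name, b) != b:
                return False  # same null bound to two different values
        elif a != b:
            return False      # constants must match exactly
    return True

def reduce_solution(tuples):
    """Drop tuples made redundant by a more informative tuple.

    Per-tuple subsumption only: a rough illustration of core-like
    redundancy removal, not the full homomorphism-based core.
    """
    tuples = list(dict.fromkeys(tuples))  # drop exact duplicates first
    return [t for t in tuples
            if not any(u is not t and subsumed(t, u) for u in tuples)]
```

    For instance, the tuple ("Smith", N1) is redundant once ("Smith", "NY") is present, since mapping the null N1 to "NY" recovers it.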

    Concise and Expressive Mappings with +Spicy

    No full text
    We introduce the +Spicy mapping system. The system is based on a number of novel algorithms that contribute to increasing the quality and expressiveness of mappings. +Spicy integrates the computation of core solutions into the mapping generation process in a highly efficient way, based on a natural rewriting of the given mappings. This allows for an efficient implementation of core computations using common runtime languages like SQL or XQuery and guarantees very good performance, orders of magnitude better than that of previous algorithms. The rewriting algorithm can be applied both to mappings generated by the system and to pre-defined mappings provided as part of the input. To do this, the system was enriched with a set of expressive primitives, so that +Spicy is the first mapping system that brings together a sophisticated and expressive mapping generation algorithm with an efficient strategy to compute core solutions.

    Core schema mappings: Scalable core computations in data exchange

    No full text
    Research has investigated mappings among data sources from two perspectives. On one side, there are studies of practical tools for schema mapping generation; these focus on algorithms that generate mappings based on visual specifications provided by users. On the other side, there is theoretical research on data exchange, which studies how to generate a solution – i.e., a target instance – given a set of mappings usually specified as tuple generating dependencies. Since the notion of a core solution has been formally identified as an optimal solution, it is very important to efficiently support core computations in mapping systems. In this paper we introduce several new algorithms that contribute to bridging the gap between the practice of mapping generation and the theory of data exchange. We show how, given a mapping scenario, it is possible to generate an executable script that computes core solutions for the corresponding data exchange problem. The algorithms have been implemented and tested using common runtime engines to show that they guarantee very good performance, orders of magnitude better than that of known algorithms that compute the core as a post-processing step.

    The Spicy system: Towards a Notion of Mapping Quality

    No full text
    We introduce the Spicy system, a novel approach to the problem of automatically selecting the best mappings between two data sources. Known schema mapping algorithms rely on value correspondences -- i.e., correspondences among semantically related attributes -- to produce complex transformations among data sources. Spicy brings together schema matching and mapping generation tools to further automate this process. A key observation here is that the quality of the mappings is strongly influenced by the quality of the input correspondences. To address this problem, Spicy adopts a three-layer architecture, in which a schema matching module provides input to a mapping generation module. A third module, the mapping verification module, then checks candidate mappings and chooses the ones that best transform the source into the target. At the core of the system stands a new technique for comparing the structure and actual content of trees, called structural analysis. Experimental results show that our mapping discovery algorithm achieves both good scalability and high precision in mapping selection.
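    The tree-comparison idea behind structural analysis can be illustrated with a simple stand-in: compare two labeled trees by the overlap of their root-to-node label paths. This is only a rough sketch under an assumed tree encoding of (label, children); Spicy's actual structural analysis also compares the distributions of the values stored in the trees.

```python
def label_paths(tree, prefix=()):
    """tree: a (label, children) pair, children being a list of subtrees.
    Yield every root-to-node path of labels."""
    label, children = tree
    path = prefix + (label,)
    yield path
    for child in children:
        yield from label_paths(child, path)

def structural_similarity(t1, t2):
    """Jaccard overlap of label-path sets -- a rough stand-in for
    structure comparison, ignoring value content."""
    p1, p2 = set(label_paths(t1)), set(label_paths(t2))
    return len(p1 & p2) / len(p1 | p2)
```

    Two schema trees that differ in a single leaf (say, one has a "city" element the other lacks) share three of four label paths, scoring 0.75.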